Applying Length-Dependent Stochastic Context-Free Grammars to RNA Secondary Structure Prediction
Authors
Abstract
To capture effects of co-transcriptional folding, we extend stochastic context-free grammars so that the probability of applying a rule can depend on the length of the subword that is eventually generated from the symbols introduced by the rule. We show that existing algorithms for training and for determining the most probable parse tree can easily be adapted to the extended model without loss of performance. Furthermore, we show that the extended model is suited to improving the quality of RNA secondary structure predictions. The extended model may also be applied in other fields where stochastic context-free grammars are used, such as natural language processing. Additionally, it raises some interesting questions in the field of formal languages.
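As a rough illustration of the idea in the abstract, the sketch below adapts a CYK-style Viterbi parser so that each rule's probability is looked up together with the length of the subword the rule generates. It is not the authors' implementation: the grammar is assumed to be in Chomsky normal form, and the names `viterbi_parse`, `toy_prob`, `BINARY`, and `TERMINAL`, as well as all probability values, are hypothetical and chosen only for the example.

```python
import math


def viterbi_parse(word, start, binary_rules, terminal_rules, prob):
    """Most probable parse under a length-dependent SCFG in Chomsky normal form.

    binary_rules:   iterable of (A, B, C) for rules A -> B C
    terminal_rules: iterable of (A, a) for rules A -> a
    prob(rule, length): probability of applying `rule` when the subword it
        generates has the given length (hypothetical interface).
    Returns (log_probability, tree), or (-inf, None) if no parse exists.
    """
    n = len(word)
    best = {}  # (i, j, A) -> best log-probability of deriving word[i:j] from A
    back = {}  # (i, j, A) -> (k, B, C): split point and children of that parse

    # Spans of length 1 can only come from terminal rules A -> a.
    for i, ch in enumerate(word):
        for A, a in terminal_rules:
            p = prob((A, a), 1)
            if a == ch and p > 0:
                key = (i, i + 1, A)
                best[key] = max(best.get(key, -math.inf), math.log(p))

    # Longer spans: the only change from the usual algorithm is that the
    # rule probability is queried with the span length.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for A, B, C in binary_rules:
                p = prob((A, B, C), length)
                if p <= 0:
                    continue
                for k in range(i + 1, j):
                    left = best.get((i, k, B))
                    right = best.get((k, j, C))
                    if left is None or right is None:
                        continue
                    score = math.log(p) + left + right
                    if score > best.get((i, j, A), -math.inf):
                        best[(i, j, A)] = score
                        back[(i, j, A)] = (k, B, C)

    def build(i, j, A):
        # Rebuild the parse tree from the stored back-pointers.
        if (i, j, A) not in back:
            return (A, word[i])
        k, B, C = back[(i, j, A)]
        return (A, build(i, k, B), build(k, j, C))

    if (0, n, start) not in best:
        return -math.inf, None
    return best[(0, n, start)], build(0, n, start)


# Toy grammar S -> S S | a with made-up, untrained probabilities.
BINARY = [("S", "S", "S")]
TERMINAL = [("S", "a")]


def toy_prob(rule, length):
    if len(rule) == 2:  # terminal rule S -> a
        return 0.6
    # Binary rule S -> S S: assumed less likely for longer subwords.
    return 0.4 if length <= 2 else 0.2


logp, tree = viterbi_parse("aaa", "S", BINARY, TERMINAL, toy_prob)
```

Because the length lookup happens once per rule and span, the loop structure matches that of the length-independent algorithm, which fits the abstract's remark that the adaptation comes without loss of performance. In a trained model `prob` would be estimated from data; the toy values here are not normalized.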
Related resources
Stochastic Context-Free Grammars and RNA Secondary Structure Prediction
This thesis focuses on the prediction of RNA secondary structure using stochastic context-free grammars (SCFG). The RNA secondary structure prediction problem consists of predicting a 2-dimensional structure from a 1-dimensional nucleotide sequence. The theory behind SCFG is explained and an overview of the research literature on various methods in the field of secondary structure prediction is g...
RNA secondary structure prediction and runtime optimization
1. Background: RNA secondary structure; pseudoknots; non-coding RNA. 2. CONTRAfold: probabilistic RNA folding: overview of the algorithm; details of the algorithm; performance of CONTRAfold. 3. Other RNA folding methods: physics-based models and stochastic context-free grammars: physics-based models; stochastic context-free grammars; advantages of CONTRAfold over these other approaches. 4. How RNA folding ...
Prediction of RNA Pseudoknotted Secondary Structure using Stochastic Context Free Grammars (SCFG)
Pseudoknots are a frequent RNA structural motif that plays essential roles in a variety of the cell's biocatalytic functions. One of the most challenging problems in bioinformatics is predicting this secondary structure from the nucleotide sequence that dictates it. Previously, a model adapted from computational linguistics, Stochastic Context Free Grammars (SCFG), has been used to predict RNA secon...
Introduction to stochastic context free grammars.
Stochastic context free grammars are a formalism which plays a prominent role in RNA secondary structure analysis. This chapter provides the theoretical background on stochastic context free grammars. We recall the general definitions and study the basic properties, virtues, and shortcomings of stochastic context free grammars. We then introduce two ways in which they are used in RNA secondary ...
Maximizing Expected Base Pair Accuracy in RNA Secondary Structure Prediction by Joining Stochastic Context-Free Grammars Method
The identification of RNA secondary structures has been among the most exciting recent developments in biology and medical science. Prediction of RNA secondary structure is a fundamental problem in computational structural biology. For several decades, free energy minimization has been the most popular method for prediction from a single sequence. It is based on a set of empirical free energy c...
Journal title:
- Algorithms
Volume 4, Issue
Pages -
Publication date 2011